Noisy Or-based model for Relation Extraction using Distant Supervision

نویسندگان

  • Ajay Nagesh
  • Gholamreza Haffari
  • Ganesh Ramakrishnan
چکیده

Distant supervision, a paradigm of relation extraction where training data is created by aligning facts in a database with a large unannotated corpus, is an attractive approach for training relation extractors. Various models are proposed in recent literature to align the facts in the database to their mentions in the corpus. In this paper, we discuss and critically analyse a popular alignment strategy called the “at least one” heuristic. We provide a simple, yet effective relaxation to this strategy. We formulate the inference procedures in training as integer linear programming (ILP) problems and implement the relaxation to the “at least one ” heuristic via a soft constraint in this formulation. Empirically, we demonstrate that this simple strategy leads to a better performance under certain settings over the existing approaches.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploring Distinctive Features in Distant Supervision for Relation Extraction

Distant supervision (DS) for relation extraction suffers from the noisy labeling problem. Most solutions try to model the noisy instances in the form of multi-instance learning. However, in the non-noisy instances, there may be noisy features which would harm the extraction model. In this paper, we employ a novel approach to address this problem by exploring distinctive features and assigning d...

متن کامل

Reducing Wrong Labels in Distant Supervision for Relation Extraction

In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. However, this heuristic can fail with the result that s...

متن کامل

Removing Noisy Mentions for Distant Supervision Eliminando Menciones Ruidosas para la Supervisión a Distancia

Relation Extraction methods based on Distant Supervision rely on true tuples to retrieve noisy mentions, which are then used to train traditional supervised relation extraction methods. In this paper we analyze the sources of noise in the mentions, and explore simple methods to filter out noisy mentions. The results show that a combination of mention frequency cut-off, Pointwise Mutual Informat...

متن کامل

Combining Generative and Discriminative Model Scores for Distant Supervision

Distant supervision is a scheme to generate noisy training data for relation extraction by aligning entities of a knowledge base with text. In this work we combine the output of a discriminative at-least-one learner with that of a generative hierarchical topic model to reduce the noise in distant supervision data. The combination significantly increases the ranking quality of extracted facts an...

متن کامل

Modeling Relations and Their Mentions without Labeled Text

Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relati...

متن کامل

Semantic Consistency: A Local Subspace Based Method for Distant Supervised Relation Extraction

One fundamental problem of distant supervision is the noisy training corpus problem. In this paper, we propose a new distant supervision method, called Semantic Consistency, which can identify reliable instances from noisy instances by inspecting whether an instance is located in a semantically consistent region. Specifically, we propose a semantic consistency model, which first models the loca...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014